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^^ ' DNA torsion dynamics is essential in the transcription process; a simple model for it, in reasonable 

5h , agreement with experimental observations, has been proposed by Yakushevich (Y) and developed by 

MH' several authors; in this, the DNA subunits made of a nucleoside and the attached nitrogen bases are 

described by a single degree of freedom. In this paper we propose and investigate, both analytically 

and numerically, a "composite" version of the Y model, in which the nucleoside and the base are 

described by separate degrees of freedom. The model proposed here contains as a particular case 

the Y model and shares with it many features and results, but represents an improvement from both 

the conceptual and the phenomenological point of view. It provides a more realistic description of 

DNA and possibly a justification for the use of models which consider the DNA chain as uniform. 

It shows that the existence of solitons is a generic feature of the underlying nonlinear dynamics and 

PQ I is to a large extent independent of the detailed modelling of DNA. The model we consider supports 

_• . solitonic solutions, qualitatively and quantitatively very similar to the Y solitons, in a fully realistic 

, ^Z^ ' range of all the physical parameters characterizing the DNA. 

^ : I. INTRODUCTION 

>, 

\l , The possibility that nonUnear excitations - in particular, kink solitons or breathers - in DNA chains play a functional 

role has attracted the attention of biophysicists as well as nonlinear scientists since the pioneering paper of Englander 
et al. [13, and the works by Davydov on solitons in biological systems 14] . 
^^ ■ A number of mechanical models of the DNA double chain have been proposed over the years, focusing on different 

\^ [ aspects of the DNA molecule and on different biological, physical and chemical processes in which DNA is involved. 
(T^ . Here we will not discuss these, but just refer the reader to the discussions of such attempts given in the book by 

"^ ' Yakushevich [S^ and in the review paper by Peyrard [S^ (see also the conference [S^), also for what concerns earlier 
. ,—i , attempts which constituted the basis on which the models considered below were first formulated. |63| Similarly, we 
i-*p ' will not describe the structure and functioning of DNA, but just refer e.g. to |3,[l3,lia|- See also l3l,|43| for the role 
qh' of Nonlinear Dynamics modelling in the understanding of DNA, and [SJ, |43| for DNA single-molecule experiments 
1" • (these were initiated about fifteen years ago [43, but their range and precision has dramatically increased in recent 
years; the formation of bubbles in a double-stranded DNA has been observed in P]). 

In recent years, two models have been extensively studied in the Nonlinear Physics literature; these are the model 
Vh ' by Peyrard and Bishop [4ll (and the extensions of this formulated by Dauxois 13] and later on by Barbi, Cocco, 
. . , Peyrard and Ruffo |3jI31) see also Cocco and Monasson [ij. More recent advances are discussed in 39i] and |5l[ia.l5]|l 
and the one by Yakushevich [52| ; we will refer to these as the PB and the Y models respectively. 

Original versions of these models are discussed in ^] ; they are put in perspective within a "hierarchy" of DNA 
models in [53 ■ An attempt to blend together the two is given in [531 ; see also |31|]. Interplay between radial and 
torsional degrees of freedom is considered more organically in [j, |j] . 

The PB model is primarily concerned with DNA denaturation, and describes degrees of freedom related to "straight" 
(or "radial") separation of the two helices which are wound together in the DNA double helical molecule. On the 
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other hand, the Y model - on which we focus in this note - is primarily concerned with rotational and torsional 
degrees of freedom of the DNA molecule, which play a central role in the process of DNA transcription |M |. 

In this model, one studies a system of nonlinear equations which in the continuum limit reduce to a double sine- 
Gordon type equation; the relevant nonlinear oscillations are kink solitons - which are solitons in both dynamical 
and topological sense - which describe the unwinding of the double helix in the transcription region. The latter is 
a "bubble" of about 20 bases, to which RNA Polymerase (RNAP) binds in order to read the base sequence and 
produce the RNA Messenger: the RNAP travels along the DNA double chain, and so does the unwound region. The 
proposal of Englander et al. [l3| was that if the nonlinear excitations are not created or forced by the RNAP but are 
anyway present due to the nonlinear dynamics of the DNA double helix itself, a number of questions - in particular, 
concerning energy flows - receive a simple explanation. 

The Y model has been studied in a number of paper, in particular for what concerns its solitonic solutions; here we 
will quote in particular |2(]l l2lL l29ll53 . ISJL l57l | . It has been shown that it gives a correct prediction of quantities related 
to small amplitude dynamics, such as the frequency of small torsional oscillations; as well as of quantities related to 
fully nonlinear dynamics, such as the size of solitonic excitations describing transcription bubbles |2^I5m|. Moreover, 
in its "helicoidal" version, it provides a scenario for the formation of nonlinear excitation out of linear normal modes 
lying at the bottom of the dispersion relation branches ^]. On the other hand, if we try to fit the observed speed of 
waves along the chain J5g| , this is possible only upon assuming unphysical values for the coupling constants |57| . 

The Y model is also a very simple one, and adopts the same kind of simplification as in the PB model. In particular, 
two quite strong features of the models are: 

• (a) there is a single (angular) degree of freedom for each nucleotide; 

• (6) all bases are considered as identical. 

These were in a sense at the basis of the success of the model, in that thanks to these features the model can be 
solved exactly and one can check that predictions allowed by the model correspond to the real world situations for 
certain specific quantities. But the features mentioned above are of course not in agreement with the real situation. 

Indeed, it is well known that bases are quite different from each other, and in particular purines are much bigger than 
pirimidines; hence feature (b) - albeit necessary for an analytical treatment of the model - is definitely unrealistic. 
Moreover, it is quite justified to consider several groups of atoms within a single nucleotide (the phosphodiester chain, 
the sugar ring, and the nitrogen base) as substantially rigid subunits; but these - in particular the sugar ring - have 
some degree of flexibility, and what's more they have a considerable freedom of displacement - in particular for what 
concerns torsional and rotational movements - with respect to each other. Thus, even in a simple modelling, feature 
(a) is not justified per se, and it seems quite appropriate to consider several subunits within each nucleotide. In 
this sense, we will speak of a composite Yakushevich (Y) model. Needless to say, if such a more detailed modelling 
would produce results very near to those of the simple Y model, this should be seen as a confirmation that the latter 
correctly captures the relevant features of DNA torsional dynamics - hence justify a posteriori feature (a) of the Y 
model. 

In this work we propose and study a composite Y model (in the sense mentioned above), in which we describe 
with two independent angular degrees of freedom, the nucleoside (i.e. the segment of the sugar-phosphate backbone 
pertaining to the nucleotide) and the nitrogen base in each nucleotide. 

The purpose of our study of such a model is to shed light on the following points. 

• (A) We want to take care (to some extent) of correcting feature (a) above and thus investigating - by comparing 
results - how justified is the original Y modelling in terms of one degree of freedom per nucleotide. 

• (B) We aim at opening the way to correct - or justify - feature (b) of the Y model. Indeed, in our model we 
will consider separately the part of the nucleotide which precisely replicates identical in each nucleotide (the 
unit of the sugar-phosphate backbone), and the part which varies from one nucleotide to the other (the bases). 
We will then be able to study the different role of the two. 

• (C) We want to check the dependence of the solitonic solution of the model both from the geometry and from 
the value of the physical parameters chosen. In particular, we would like to understand how far the existence 
of solitons is a generic feature of DNA and if a more realistic choice of the model geometry is consistent with 
phenomenologically acceptable values of the physical parameters. 

It will turn out that the Y model, which can be considered as a particular case of our model, captures the essential 
features of DNA nonlinear dynamics. The more realistic geometry of the model we use in this paper enables a drastic 
improvement of the descriptive power of our model at both the conceptual and the phenomenological level: on the 
one hand the composite Y model keeps almost all the relevant features of the Y model, but on the other hand it 
allows for more realistic choice of the physical parameters. 



It will turn out that the different degrees of freedom we use play a fundamentally different role in the description 
of DNA nonlinear dynamics. The backbone degrees of freedom are "topological" and play to some extent a "master" 
role, while those associated to the base are "nontopological" and are (in fluid dynamics language) "slaved" to former 
ones. This opens an interesting possibility, i.e. to consider a more realistic model, in which differences among bases are 
properly considered, as a perturbation of our idealized uniform model. As the essential features of the fully nonlinear 
dynamics are related only to backbone degrees of freedom, we expect that such a perturbation - albeit with relevant 
difference in the quantitative values of some parameters entering in the model (the base dynamical and geometrical 
parameters) - will show the same kind of nonlinear dynamics as our uniform model studied here. 

The paper is organized as follows. In Sect, ^we will briefly review some basic know facts about the DNA structure 
and modelling. In Sects. Illll and llVl we will set up our model, describe the interaction and write down the equations 
of motions that govern its dynamics. The physical parameters characterizing our model are discussed in Sect. In 
Sect. I VII we discuss the linear approximation of our dynamical system, in particular its dispersion relations. In Sect. 
IVIII we will set up the framework for the investigation of the nonlinear dynamics and the topological excitations of 
our model. In sect. IVIIII wc will show how the Y model and Y solitons emerge as a particular case of our composite 
Y model and its solitons. Sect. llXI we investigate and derive numerically the solitonic solutions of our model. Finally 
in Sect. 0we summarize our work and present our conclusions. 

II. DNA STRUCTURE AND MODELLING 

DNA is a gigantic polymer, made of two helices wound together; the helices have a directionality and the two helices 
making a DNA molecule run in opposite direction. We will refer for definiteness to the standard conformation of the 
molecule (B-DNA); in this the pitch of the helix corresponds to ten base pairs, and the distance along the axis of the 
helix between successive base pairs is (5 = 3.4 A. 

The general structure of each helix can be described as follows. The helix is made of a sugar-phosphate backbone, 
to which bases are attached. The backbone has a regular structure consisting of repeated identical units (nucleosides); 
bases are attached to a specific site on each nucleoside and are of four possible types. These are either purines, which 
are adenine (A) or guanine (G), or pyrimidines, which are cytosine (C) or thymin (T) in DNA. It should be noted 
that the bases are rather rigid structures, and have an essentially planar configuration. A unit of each helix is called 
a nucleotide; this is the complex of a nucleoside and the attached base. 

The winding together of the two helices makes that to each base site on the one helix corresponds a base site on 
the other helix. Bases at corresponding sites form a base pair; each base has only a possible partner in a base pair; 
bases in a pair are linked together via hydrogen bonds (two for A-T pairs, three for G-C pairs). The base pairs can be 
"opened" quite easily, the dissociation energy for each H-bond being of the order of 0.04 eV, hence AE ~ 0.1 eV per 
base pair. Opening is instrumental to a number of processes undergone by DNA, among which notably transcription, 
denaturation and replication. 

The backbone structure (see Fig.l) is made of a phosphodiester chain and a sugar ring. To one of the C atoms 
of the sugar ring is attached a base; this is one of the four possible bases A, C, G, T, whose sequence represents the 
information content of DNA and is different for different species, and to some extent for each individual. Thus, each 
helix is made of a succession of identical nucleosides, and attached bases which can be different at each site. Bases at 
corresponding sites on the two helices form a base pair, and these can be only of two types, G-C and A-T. 

The atoms on each helix are of course held together by covalent bonds; apart from these, other interactions should 
be taken into account when attempting a description of the DNA molecule. The backbone structure has some rigidity; 
in particular, it would resist movements which represent a torsion of one nucleoside with respect to neighboring ones. 
We will refer to the interaction responsible for the forces resisting these torsion as torsional interactions. 

As already mentioned, the two bases composing a base pair are linked together by hydrogen bonds; we will refer to 
the interaction mediated by these as pairing interaction. Each base interacts with bases at neighboring sites on the 
same chain via electrostatic forces (bases are strongly polar); these make energetically favorable the conformation in 
which bases are stacked on top of each other, and therefore are referred to as stacking interaction. 

Finally, water filaments - thus, essentially, bridges of hydrogen bonds - link units at different sites; these are also 
known as Bernal- Fowler filaments |l4| . In particular, they have a good probability to form between nucleosides or 
bases which are half-turn of the helix apart on different chains, i.e. which are near to each other in space due to 
the double helix geometry; these water filaments-mediated interactions are therefore also called helicoidal interactions 
[65]. We stress that these are quite weaker than other interactions, and can be safely overlooked when we consider the 
fully nonlinear regime. They are instead of special interest when discussing small amplitude (low energy) dynamics, 
as - just because of their weakness - they are easily excited and introduce a length scale in the dispersion relations 
(see below). 

If we consider large amplitude deviations from the equilibrium configurations, then motions will not be completely 
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FIG. 1: The structure of a DNA helix. A nucleotide is shown; nitrogen bases are attached to the Ci atom in the sugar ring. 

free: the molecule is densely packed, and the presence of the sugar-phosphate backbone - and of neighboring bases - 
will cause steric hindrances to the base movements. In particular, for the rotations in a plane perpendicular to the 
double helix axis, the bases will not be able to rotate around the Ci atom for more than a maximum angle (po without 
colliding with the nucleoside. 

This will lead of course to complex behaviors as the DNA helix gets unwound; in particular, as ip gets near to its 
limit value tpo we expect some kind of essentially (if not mathematically) discontinuous behavior. This should not be 
seen as shortcoming of the model: it is indeed well known that bases rotate in a complex way while flipping about 
the DNA axis (see e.g. 12 ). 

Finally, we mention that here we consider a DNA molecule without taking into account its macro-conformational 
features; that is, we consider an "ideal" molecule, disregarding supercoiling, organization in istones, and all that 9]. 

III. COMPOSITE Y MODEL 



As mentioned above, we will model the molecule as made of different parts (units), each of them behaving as a 
single element, i.e. as a rigid body. We consider each nucleoside N as a unit, to which a base B (considered again as 
a single unit) is attached. 




FIG. 2: A base pair in our model. The origin of the coordinate system is in O. The angles 6\ between the lines AO and 
AB and 62 between A'O and A'B' correspond to torsion of sugar-phosphate backbone with respect to the equilibrium B-DNA 
conformation; the angles ipi between the line AB and the line BC, and (p2 between A'B' and B'C correspond to rotation of 
bases around the C\ — N bond linking them to the nucleotide. All angles are in counterclockwise direction; thus the angles 62 
and ifi2 in the figure are negative. 



General features 



We will hence model each of the helices in the DNA double chain as an array of elements (nucleotides) made of 
two subunits; one of these subunits model the nucleoside, the other the nitrogen base. We will consider the bases as 
all equal, thus disregarding the substantial difference between them jg^- The chains - and thus the arrays ~ will be 
considered as infinite. 

We will use a superscript a = 1,2 to distinguish elements on the two chains, and a subscript i G Z to identify 
the site on the chains. Thus the base pairing will be between bases Bi ' and Bi , while stacking interaction will be 
between base Bf"' and bases B^^i and Bi_^. 

We will consider each nucleoside A'^^"' as a disk; bases will be seen as disks themselves, with a point on the border 
of Bj attached via an inextensible rod to a point Pc on the border of N^'; these points on B and N represent the 
locations of the N atom on B and of the Ci atom on N involved in the chemical bond attaching the base to the 
nucleoside. The rod can rotate by an angle ±1^0 before B collides with N; on the other hand, the disk N can rotate 
completely around its axis. 

We also single out a point ph on the border of the disk modelling the base; this represents the atom(s) which form 
the H bond with the corresponding base on the other DNA chain. 

The disks (i.e. the elements of our model) are subject to different kinds of forces, corresponding to those described 
above: torsional forces resisting the rotation of one disk Nf"' with respect to neighboring disks N^^^ and Nyt\ on 



the same chain; stacking forces between a base Bi and neighboring bases B\_^^ and B\_^ on the same chain; pairing 

le same base pair; and finally, helii 
Bernal- Fowler filaments linking bases B] and B\2^ (and B\ and SJ-ts). 



forces between bases BI and B\ in the same base pair; and finally, helicoidal forces correspond to hydrogen bonded 



B. The Lagrangian 

We should now translate the above discussion into a Lagrangian defining our model. This will be written as 

L = T - {Ut + Us + Up + Uh) (III.l) 

where T is the kinetic energy, and Ua are the potential energies for the different interactions listed above, i.e. 

• Ut is the backbone torsional potential, 

• Us is the stacking potential, 

• Up is the pairing potential, 

• Uh is the helicoidal potential. 



These will be modelled by two-body potentials, for which we use the notation Va, to be summed over all interacting 
pairs in order to produce the Ua- 

We denote by / the moment of inertia (around center of mass) of disks modelling the nucleosides, and by Is the 
moment of inertia of bases around the C'l atom in the sugar ring; as the bases can not rotate around their center of 
mass, this is equal to mr^, where m is the base mass and r is the distance between the Ci atom in the sugar and the 
center of mass of the base. 

C. The degrees of freedom 

We are primarily interested in the torsional dynamics. Thus, for each element we will consider torsional movements, 
hence a rotation angle (with respect to the equilibrium conformation, which we take for definiteness to be B-DNA); 
these will be denoted as 6*^ for the nucleoside N^"" , and fl" for the base Bj . Only these rotations will be allowed 
in our model. All angles will be positive in counterclockwise sense. 

The angles represent a torsion of the sugar-phosphate backbone with respect to the equilibrium configuration; 
thus they are related to unwinding of the double helix. On the other hand, the angles ip represent a rotation of the 
base with respect to the corresponding nucleoside; the motion described by ip can be thought of as a rotation around 
the Ci atom in the sugar ring. Note that the hindrances due to the presence of backbone atoms constrain rotation of 
the base around the Ci atom. 

Thus, as mentioned above, the angles will have a different range of values: 

el"^ e R , ip\^^ e [-V'o, (po] ipo<Tr . (111.2) 

The actual value of (po is not essential. The important feature is that the base can not rotate freely around the Ci 
atom, but only pivot between certain limits. At the level of the numerical analysis the simplest way to implement the 
previous boundary condition is to use a confining potential, which reproduces approximately the form of a box. For 
this reason in Sect. IIXI we will add to the Hamiltonian of the system a confining potential Vw = Ktan^ (p'^°'\ 

It should be stressed that - just on the basis of these different ranges of variations - there will be a substantial 
difference between the degrees of freedom described by the angles: those described by 6 angles will be topological 
degrees of freedom, while those described by if angles will only describe local and (relatively) small motions - hence 
ip describe non topological degrees of freedom. 

D. Cartesian coordinates 

In computing the kinetic energy, it will be convenient to consider cartesian coordinates. With reference to Fig. |21 
the cartesian coordinates in the {x, y) plane orthogonal to the double helix axis of relevant points will be as follow. 

The center of disks, representing the position of the phosphodiester chain, will be [xo , yo ) ; the point on the 

border of the disks representing the Ci atom to which the base disks are attached will be (xc , yc ) ■ The center 

of mass of the bases will be {x^" , y)^ ) , and the point on the border of the disks modelling bases representing the 

atom(s) forming the H bonds will be {x^\y^ ). 

In terms of the {0, Lp} angles, these are given by (we omit the site index i for ease of writing, and give condensed 
formulas for the two chains, with first sign referring to chain 1): 



(1.2) (1,2) 




Xo' ' ^ =Fa, Vo 


= 0; 


xi'^'^=xi'^'^±Rco8{9i^-^)), y^'''^ 


= ±i?sin(0(i'2)). 


X^') = xi^^') ± r COS(0(1>2) + ^(1,2))^ ^(1.2) 


= y(^^')±rsin(0(i>2) 


x^^)=x(^'^)±d,cos(0(i^2)+^(i^2)), yi'-'^ 


= y^'''^±d^sin(0(i^2 



+ ^(1^2))^ 

Here and in the following we denote by R the radius of disks describing nucleosides, i.e. the length of the segments 
AB and A'B' in Fig. [21 (this is the distance from the phosphodiester chain to the Ci atom); by r the distance between 
the center of mass of bases and the border of the disk modelling the nucleoside (i.e. the Ci atom). We also denote by 
dh the lengths (supposed equal) of the segments BC and B'C joining the Ci atom on the nucleoside and the atoms 
of the bases forming the hydrogen bond linking this to the complementary base. The parameter a corresponds to the 
distance between the double helix axis and the phosphodiester chain, whereas po is the distance between points C and 
C in the equilibrium configuration. The previous parameters are obviously related by the equation 2a = 2R+2dh + po. 



E. Kinetic energy 

With this notation, and standard computations, the kinetic energy of each nucleotide is written as 



T,- 



(a) _ 



1 



mr^ip"^ + 2mr{r + Rcosip)9ip+ [l + mB{R^ +r^) + 2mRrcosip) (P' 



where we have suppressed super- and sub-scripts for case of reading. Thus, the total kinetic energy for the double 
chain is 
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/ + niBiR^ + r2) + 2mRrcos{ip\''^)] (e^^A' 



We have thus considered a general class of composite Y models; so far we have not specified the interaction 
potentials, which are needed to have a definite model. 

F. Modelling the interactions 

We have now to specify our model by fixing analytical expressions for terms modelling potential interactions in the 
lagrangian l|III.ip . As one of our aims is to compare the results obtained by a composite Y model with those obtained 
with the simple Y model, we will make choices with the same physical content as those made by Yakushevich. 

1. Torsional interactions 

Torsional forces will depend only on difference of angles (measured with respect to the equilibrium B-DNA config- 
uration) of neighboring units on the same phosphodiester chain; thus we introduce a torsion potential Vt and have 

a i 

The potential Vt must have a minimum in zero, and be 27r-periodic in order to take into account the fundamentally 
discrete and quantum nature of the phosphodiester chain. Here we will take the simplest such function |67|. i.e. 
(adding an inessential constant so that the minimum corresponds to zero energy) 

Vt{x) = Kt{l ~ cos(:e)) (III.6) 

where Kt is a dimensional constant. Thus, our choice for torsional interactions will be 

f^* = ^* EE [i - ^o^{(^^tX-(^^t^)] ■ (ni-7) 

a i 

The harmonic approximation for this is of course 



2. Stacking interactions 

Stacking between bases will only depend on the relative displacement of neighboring bases on the same helix in the 
plane orthogonal to the double helix axis [681 . That is, introducing a stacking potential Vs we have 



Us - HHy^i-'r') , (iiii 



where 

'^i"^ := v'(xS-x('^))2 + (yl:\-yl'^))2, (III.9) 

where xf' , iif are the coordinates of the center of mass of the bases. The simplest choice corresponds to a harmonic 
potential 69], Vs = {l/2)Ks(j'^ ■ This will be our choice - which again corresponds to the one made in the PB and in 
the Y models, so that 



We should however express this in terms of the 9 and </? angles. With standard algebra, using Eqs. I|III.3|I . we obtain 

Us = \Ks EaE. 2[R' + r' + 

- R^ cos(0(:\ - 9^) - r^ cos[(0(:\ - 6^^) + (^(^\ - ^1'^))] + 

- Rr (cos[i9[l\ - 0f )) + ^(^\] + cos[(0(:\ - ^f )) - ^\^^]) + ("I-H) 
+ i?r (cos(^(^\)+cos(<^i'^)))^ . 

interactions 

Pairing interactions are due to stretching of the hydrogen bonds linking bases in a pair. Introducing a paring 
potential Vp which models the H bonds, we have 

t/, = ;^y,(^f\^f\^«,^f)). (III.12) 

i 

We note that H bonds are strongly directional, so that they are quickly disrupted once the alignment between pairing 
bases is disrupted. This feature is traditionally disregarded in the Y model, where it is assumed that Vp only depends 
on the distance 



p, := ^J {x^P ~ xf^Y + (yf^ - yf^Y (III.13) 

between the interacting bases; that is. 

Up ^Y. ^p^P'^) ■ (™-i4) 

i 

As noted by Gonzalez and Martin-Landrove [23 in the context of the Yakushevich model, one should be careful in 
expanding a potential Vp{p) in terms of the rotation angles if and 9: indeed, unless po = 0, i.e. a = R + dh-, one would 
get zero quadratic term in such an expansion (see however 24,] for what concerns solitons in this context). 

As for the potential Vp, there are two simple choices for this appearing in the literature. On the one hand, 
Yakushevich |52| suggests to consider a potential harmonic in the intrapair distance p (this would appear nonlinear 
when expressed through rotation angles) and this has been kept in subsequent discussions and extensions of her model 
[56]; on the other hand, Peyrard and Bishop -41] consider a Morse potential; again this has been kept in subsequent 
discussions and extensions of their model 39] . 

There is no doubt that the Morse potential is more justified in physical terms; however, as we wish to compare our 
results with those of the original Y model, we will at first consider a harmonic potential 

VPip) = lKp{p~po)\ (III.15) 

where po is the intrapair distance in the equilibrium configuration. Moreover, again in order to compare our results 
with those of the original Y model, we will later on set po = 0. This corresponds to setting a = R + dh- These 
approximations can appear very crude, but experience gained (as preliminary work for the present investigation) with 
the standard Yakushevich model |24. |25| suggests they do not have a great impact at the level of fully nonlinear 
dynamics. 



We should express Vp in terms of the rotations angles. Using once again the expressions (jIII.3|) . we have with 
standard computations that 
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With this, our choice for the pairing part of the hamiltonian will be 



(III.16) 



(III.17) 



4- Helicoidal interactions 



Helicoidal interaction are mediated by water filaments (Bernal- Fowler filaments [lj|) connecting different nu- 
cleotides; in particular we will consider those being on opposite helices at half-pitch distance, as they are near 
enough in three-dimensional space due to the double helical geometry. As the nucleotide move, the hydrogen bonds 
in these filaments - and those connecting the filaments to the nucleotides - are stretched and thus resist differential 
motions of the two connected nucleotides. 

We will, for the sake of simplicity and also in view of the small energies involved, only consider filaments forming 
between nucleosides; thus only the 9 angles will be involved in these interactions. We have therefore, introducing a 
helicoidal potential Vh and recalling that the pitch of the helix corresponds to f bases in the B-DNA equilibrium 
configuration. 
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(III.18) 



As the angles 9 are involved, the potential Vh should be 27r-periodic |76| . 

Such water filament connections involve a large number (around 10) of hydrogen bonds; hence each of them is only 
slightly stretched, and it makes sense to consider the angular-harmonic approximation 



Vhir) = Kh[l~cosiT)] 



1 



:K,t' 



(III. 19) 



Our choice will therefore be 



Uh 



K, 



i:[= 



,(1) 



3(2)^ 



q(2) 



^P) 



(III.20) 



IV. EQUATIONS OF MOTION 



In the previous sections we have set up the model and the interactions. Let us now study its dynamics. We denote 
collectively the variables as '0°, e.g. with ip = ((p(^\ (/^(^^ 6''^), 0(^)). The dynamics of the model will be described 
by the Euler-Lagrange equations corresponding to the Lagrangian IjIII.ip with the terms in the interaction potential 
given respectively by Eqs. (|nL7^ . (|III.lip . (|IIL17p . (|III.20p 



dL d dL 



= 



(IV.l) 
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With our choices for the different terms of L, and writing a for the complementary chain of the chain a (that is, 1 = 2, 
2 = 1), these read 

mr'^ipf' + mr[Rcos{Lp°' ) + r]!)^ + mrRsm{ipi''){9i''Y = 

-Ky sin(^('^) - ^(^\ + e'f^ ~ 0(^\) + duKpRsini^^t'^ + o'f^ - 9^)+ 
dlK, sin(^|'^) - ^f ) + ef ) - 0f )) + i?(d„i^p + 2K,r) sin[^|"^] ; 

mri?cos(¥>|"^)((^("^ + 2^'f )) + mr^^j"^ + /^'"^ + mr^e[''^ + mR^'f'^ 
-mri? sin(^|'^V!"^ (#' +2^f^) = 

= {Kt + KsR^) ^HOt\ - 0^^) + K^rR sin(¥.|^\ - (9^^^ - 9^\)) (1^.2) 

-Xy sin((^('^) - v^\) + {9^:^^ - 9t\)) - 2aK,Rsm{9i^^y 
2adhKpSm{ip\''^ + ^f )) - KsrRsm[ip[''^ + [91"^ - 9t\)] 
-K^vRsH^t'^ - (^S - (^f)) + {Kt + K^R^) sin(0(;\ - 9^'^) 
+K,r^ sin[(^i:\ - ^^t^) + {9^, - ^l'^))] + K^rRsH^^^, + {9^^ - flf ))] + 

Kpi?2 3ij^(^(a) _ ^(a)^ ^ d,if,i? sin(^('^' + (^f"^) - ^f ))) 



d2ifpSin((^|'^) - ^f )) + (0f ) - 0f ))) - d,ifpi?sin(^f ) - (0f ) - 0f ))) 



+^^.(C 



^_20f)+ 0(^)3 



Note that here a,R,r,dh are considered as independent parameters, i.e. we have not enforced the Yakushevich 
condition R + dh — a (i.e. po = 0). Needless to say, these are far too complex to be analyzed directly, and we will 
need to introduce various kinds of approximation. 

We have thus completely specified the model we are going to study and derived the equations that govern its 
dynamics, i.e. its lagrangian and the equations of motion. The choice of torsion angles as variables to describe our 
dynamics led to involved expressions, but our choices are very simple physically. 

We have considered "angular harmonic" approximations (expansion up to first Fourier mode) i.e. potentials of the 
form V{x) = [1 — cos(a::)] for the torsion and helicoidal interactions, harmonic approximation for the base stacking 
interaction, and a harmonic potential depending on the intrapair distance for the pairing interaction. Our approxima- 
tions are coherent with those considered in the literature when dealing with uniform models of the DNA chain, and in 
particular when dealing with (extensions of) the Yakushevich model. Thus, when comparing the characteristic of our 
model with those of these other models, we are really focusing on the differences arising from considering separately 
the nucleoside and the base within each nucleotide. 

It would of course be possible to consider more realistic expressions for the potentials; but we believe that at the 
present stage this would rather obscure the relevant point here, i.e. the discussion of how such "composite" models 
can retain the remarkable good features of the Y model and at the same time overcome some of the difficulties they 
encounter. Finally, we note that it is quite obvious that the dynamical equations describing the model are - despite 
the simplifying assumptions we made at various stage - too hard to have any hope to obtain a general solution, either 
in the discrete or in the continuum version (see below) of the model. 

In the next section we will focus our attention on the choice of the physical parameters appearing in our model. 
Later, we will investigate the dynamics beginning with the linear approximation and then in the fully nonlinear regime. 

V. PHYSICAL VALUES OF PARAMETERS 

In order to have a well defined model we should still assign concrete values to the parameters - both geometrical 
ones and coupling constants - appearing in our Lagrangian (jIII.l|) and in the equation of motion ()IV.2p . 

A. Kinematical parameters 

Let us start by discussing kinematical parameters; in these we include the geometrical parameters as well as the 
mass m and the moment of inertia /. 

The masses can be readily evaluated by considering the chemical structure of the bases. They can be calculated 
just by knowing masses of the atoms and their multiplicity in the different bases. As for the geometrical parameters 
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like R, a, r and dh (and the moment of inertia /), quite surprisingly different authors seem to provide different 
values for these. Rather than assuming the values given by one or another author, we have preferred to estimate 
the parameters using the available information about the DNA structure. Position of atoms within the bases (which 
of course determine R, a, r and rf/j, and hence /) and geometrical descriptions of DNA are widely available to the 
scientific community in form of PDB files 36]. We will use this information (which we accessed at [ig and |33) to 
estimate directly all static parameters in play on the basis of the atomic positions. 

The geometrical parameters which are relevant for our discussion are the longitudinal width of bases lb and of the 
sugar Is, the distances of the bases from the relative sugars ds and the distance of a base from the relative dual base 
dh- We give our estimates for the masses, moments of inertia and the parameters l,ds,di, for the different bases and 
their mean values in Table Q] ^From those data and using the equations R = Ig, r = dg + lb/2, dh = h + ds, o, = 
Is +lb-\-ds +di,/2 (hats denote mean values) , one obtains the average values for the geometrical parameters appearing 
in our Lagrangian. given in table ITll 





A 


T 


G 


C 


mean 


Sugar 


m 


134 


125 


150 


110 


130 


85 


I 


3.6 X 10^ 


3.0 X 10=* 


4.4 X 10^ 


2.3 X 10^ 


3.3 X 10^ 


1.2 X 10^ 


I 


3.2 


4.0 


5.0 


2.4 


4.7 


3.3 


ds 


1.5 


1.5 


1.5 


1.5 


1.5 


- 


dt 


2.0 


2.0 


2.0 


2.0 


2.0 


- 



TABLE I: Order of magnitude for the basic geometrical parameters of the DNA. Units of measure are: atomic unit for masses 
m, 1.67 X 10~''^Kg • m^ for the inertia momenta I, Angstrom for I, ds and dj,, respectively the longitudinal width of bases and 
their distances from the relative sugars and from the relative dual base. These values have been extracted from the samgle 
"generic" B-DNA PDB data [32l|, kindly provided by the Glactone Project |27| . and double checked with the data from |lq|, 
that agree within 5%. Inertia momenta of bases has been evaluated with respect to rotations about the DNA's symmetry axis 
passing through the sugar's Ci atom the base is attached to; the inertia momentum of the sugar itself has been evaluated with 
respect to rotations about its C3 — C4 axis (see Fig. 



R 


r 


dh 


a 


3.3 A 


3.8 A 


6.2 A 


10.5 A 



TABLE IL Numerical values of the geometrical parameters chracterizing our model 



B. Coupling constants 

The determination of the four coupling constants appearing in our model is more problematic, due partly to the 
difficulties in making experiments to test single coupling constants and partly to the complexity of the system itself. 



Pairing 



The coupling constant Kp, which appears in the pairing potential I|in.l7p can be easily determine by considering 
the typical energy of hydrogen bonds. The pairing interaction involves two (in the A — T case) or three (in the G — C 
case) electrostatic hydrogen bond. The pairing potential can be modelled with a Morse function 



Vp{x) - D{e 



— bd{x,XQ) 



-i2Db'Kp 



Po) 



0{p'). 



(V.l) 



where D is the potential depth, p the distance from the equilibrium position po and h a parameter that defines 
the width of the well. Although throughout this paper we use the harmonic potential (jfff.l7|l to model the pairing 
interaction, the use of the Morse function seems more appropriate for evaluating the parameter Kp. The point is 
that the pairing coupling constant is physically determined by the behavior of the pairing potential away from its 
minimum. Using the harmonic approximation (|fff.l7p for estimate Kp would result in a completely unphysical value 
for the parameter. 
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Different estimates of tfie parameters appearing in tfie potential HV.ljl are present in the literature. The estimates 

Dat = O.OSOeV, Dgc = 0.045eV, Iat = 1.9A~\ hoc = 2.5A"^ 
are given in [l^ and used in |63| ■ The values 

D = 0.040eV, b = 4.45A"^ 
are given in [ij and used in y, [lai US • Finally, the estimates 

Dat = 0.050eV, Dgc = 0.075eV, Iat = bcc = 4A^^ 

are given in Q and used in [3, yj ■ The values of coupling constants corresponding to these different values for the 
parameters appearing in the Morse potential range across a whole order of magnitude: 

3.5 N/m < Kp := 2b^D < 38N/m . (V.2) 



In our numerical investigations we will use a value of Kp near to the lower bound given in IV. 21 that is, we adopt 
the value Kp = 4N/m, leading to an optical frequency of wq — y^^Kp/m ~ 36cm~^, so to be in agreement with ^4^ . 



2. Staa 

The determination of the torsion and stacking coupling constants is more involved and rests on a smaller amount 
of experimental data. The main information is the total torsional rigidity of the DNA chain C = SS, where S — 3.4A 
is the base-pair spacing and S is the torsional rigidity. It is known j6, 7] that 

10-2^ Jm < C < 4-10-2^Jm. (V.3) 

This information is used e.g. in [I^jISJ, whose estimate is based on the evaluation of the free energy of superhelical 
winding; this fixes the range for the total torsional energy to be 

180KJ/mol < 5 < 720KJ/mol. (V.4) 

In our composite model the total torsional energy of the DNA chain has to be considered as the sum of two parts, 
the base stacking energy and the torsional energy of the sugar-phosphate backbone. In order to extract the stacking 
coupling constant we use the further information that tt — tt stacking bonds amount at the most to 50KJ/mol |3nl |. 
Assuming a quadratic stacking potential, as we do, and a width of the potential well of about 2A we obtain the 
estimate K^ = 68N/m. The phonon speed induced by this is ci= S^/WJJm ~ 6Km/s, see eq. (|VL12|I : this is rather 
close to the the estimate of 1.8Km/s < ci < 3.5Km/s given in ^57]. 

As we shall see in detail in Sect. IIXI choosing smaller values for Kg would have non-trivial consequences since 
solitons with small topological numbers become unstable in the discrete setting when the ratios Ks/Kp and Kt/Kp 
get small enough (see sect. IIX|) . In particular, this value for Kg - together with the Kt below - is barely enough to 
allow the existence of solitons, as discussed later on this paper. 

3. Torsion and helicoidal couplings 

After extracting the stacking component, our estimate for the torsional coupling constant Kt is in the range 

130KJ/mol < Kt < 670KJ/mol. (V.5) 



Assuming (see below) that K/^ ~ Kt/25, so that C4 — y/2Kt/Is (see eqs. (|VI.ll|l and (|VI.12|l V all of these values for 
Kt induce phonon speeds slightly higher with respect to the estimates cited above, between 5 Km/s and 11 Km/s. 
For our numerical investigations, to keep the phonon speed as low as possible, we will set Kt — 130KJ/mol. 

Finally, for the helicoidal coupling constant, following [23, we assume that Kt and Kh differ by about a factor 25, 
so that Kh = 5KJ/mol. 
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4- Discussion 



It is interesting to point out how the geometry of the model nicely fits with the estimates of the binding energies 
so to induce optical frequencies and phonon speeds of the right order of magnitude (see also the discussion in Sect. 
IVI|) . This is not the case in simpler models, where in order to get the right phonon speed within a simple Y model 
one is obliged to assume for Kt the unphysical value Kt = 6000KJ/mol [57| . 

Our estimates, and hence our choices for the values of the coupling constants appearing in our model, are summarized 
in table IIIII We will use these values of the physical parameters of DNA in the next sections, when discussing both 
the linear approximation and the dispersion relations as well as the full nonlinear regime and the solitonic solutions. 



Kt 


Ks 


AV 


A-,, 


130 KJ/mol 


68N/m 


3.5 N/m 


5 KJ/mol 



TABLE III: Values of the coupling constants for our DNA model 



VI. SMALL AMPLITUDE EXCITATIONS AND DISPERSION RELATIONS 

In this section we will investigate the dynamical behavior of our model for small excitations in the linear regime. 
We will enforce the Yakushevich condition R + dh = a in order to keep the calculations and their results as simple as 
possible (see also 0). 

Linearizing the equation of motion HIV.2() around the equilibrium configuration 



"P^. 



(a) 



g(a) 



^(") = 0f ) = 0, 



(VI.l) 



we get using standard algebra. 



mr^ipi + m{rR 



-Kr, 






l(a) 



(a - i?)2(^("' + ^f ) + a[a - R){B\^^ + (^) 



m{rR + r2)^f ^ + (/ + m{R + r)'^)el°-^ 



= Kt 



g(a) 



29. 



(a) 



g(a) 



q(a) 



g(a) 



+Ks{r + R) {R + r)ie\l>, - 29^ + 0^1) + K^.^^i - ^^T' + vT-^i) 



(a) 



M 



Aa) 



-Kr, 



So.) 



{&)\ 



(a^-aR){^>+^>)+a^(9r+6r) 



-,2(n{o.) I /J(a)^ 



+ ^/.(^l+'5 



29^^+9f^^ 



Aa) 



(VI.2) 



We are mainly interested in the dispersion relations for the propagating waves, which are solution of the system (|VI.2|I 
To derive them it is convenient to introduce variables (/s^^' and 6'^^' defined as 



± (1) I (2) 

ip^ ^^\' ± 'Pi 



g± - fl(i) 



ey> ± e. 



(2) 



(VI.3) 



Let us now Fourier transform our variables, i.e. set 



'-(t) = Gf^ e^p[i{kSn + cot)] 



(VI.4) 



Here k is the spatial wave number, lu is the wave frequency, and (5 is a parameter with dimension of length and set 
equal to the interpair distance {S = 3.4 A), introduced so that k has dimension [L]^^ and the physical wavelength is 
A = 27r/fc. In this way, we should only consider k G [— tt/J, tt/J]. 

Using (|VI.3p and (|VI.4p into (|VI.2p . we get a set of linear equations for (^^,G^^); each set of coefficients with 
indices (A:,a;) decouples from other wave number and frequency coefficients, i.e. we have a set of four dimensional 
systems depending on the two continuous parameters k and lu. This is better rewritten in vector notation as 



MC 



ku) 



0, 



(VI.5) 
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where (^kuj is the vector of components 



Cki. 



— K^kuj 



^ku ' ^kcj 



Gl 



(VI.6) 



and M is a four by four matrix which we omit to write exphcitly. 

In order to simpHfy the calculations we will set to zero the radius of the disk modelling the base, i.e dh = r. As in 
our model the disk describing the base cannot rotate around its axis, this assumption does not modify the physical 
outcome of the calculations. The condition for the existence of a solution to l|VI.5(l is the vanishing of the determinant 
of M. By explicit computation the latter is written as the product of three terms, apart from a constant factor r^, 



\M\\ — r TTl 7r2 TTs 



(VI.7) 



The three factors being, 








TTl 


= 


~2Ks 


+ mw^ + 2Ks cos{Sk) , 






7r2 


— 


ILU^- 


2{{Kh + Kt) ~ Kt cos[k6] + Kh cos[5/c(5]) , 




71-3 


= 


{imr^ 


)uo^-2r^ {Kpl + 2{KJ + Ktm) : 


sm^{kS/2)+ 






+ 


2mKh 


,si-a^{5k6/2))uj^ + Sr^ {Kp + 2K, 


sm^ikS/2)) {Kh 


sin2(5fc(5/2) 


The equation 


\\M 


= has four solutions, given by 







Ktiim^(kS/2)) 



-I 



4{Ks/m) sm^{kS/2) , 

4:{Kt/I)sm^{k6/2) + 2{Kh/I)[l + cos{5kS)] 

2{Kp/m) + A{Ks/m) sm^{kS/2) , 

4.{Kt/I) sm^{kS/2) + A{KhlI) iiin^{bk5/2) . 



(VI.. 



(VI.9) 



Eqs. IJVI.9|I provide the dispersion relations for our model. Physically, the four dispersion relations correspond to 
the four oscillation modes of the system in the linear regime. The relation involving wi describes relative oscillations 
of the two bases in the chain with respect to the neighboring bases. As uji{k) -^ for k ^ there is no threshold for 
the generation of these phonon mode excitations. 

The relations involving 0^2 and lu^ are associated with torsional oscillations of the backbone. In case of UJ2 there is 
a threshold for the generation of the excitation originating in the helicoidal interaction, whereas the second torsional 
mode UJ4 has no threshold and is thus also of acoustical type. The dispersion relation involving UJ3 describes relative 
oscillations of two bases in a pair. The threshold for the generation of the excitation is now determined by the pairing 
interaction. The dispersion relations l|VI.9(l are plotted as uj/{2nc), where c is the speed of light (we use the, in 
the literature widespread, convention of measuring frequencies in 2ttc units) versus kA/2 in Fig.|21for values of the 
physical parameters given in the tables Q] and IIIII The four dispersion relations take a simple form if we consider 
excitations with wavelength A much bigger then the intrapair distance, i.e A >> S; this corresponds to the (5 — > 
limit. We have then 



2 2/2 2 

Lu^ - c^k = q^ 



(VI.IO) 



where Cq and g^ (a = 1 . . .4) are, respectively, the velocity of propagation (in the limit k >> qa) and the excitation 
threshold. They are given by 



Cl 



= Sy^Ks/m, 



91=0; 



C2 = 5^{Kt - 25Kh)/I, 92 = 2/?W^; 



C3 = Sy^Ks/m, 



(73 == y/2Kp/m; 



(VI. 11) 



(VI. 12) 



C4 = S^iKt + 25Kh)/I, qA = 0. 
Using the values of the parameters given in the tables HI and IIIII we have 

Cl — 6.1 Km/s, (71 = ; 

C2 = OKm/s, (72 = 22 cm^^ ; 

C3 = 6.1 Km/s, (73 = 36cm~^ ; 

C4 — 5.1 Km/s, (74 = , 

where C2 = comes from the fact that we are taking Kt ~ 2bKh (see table llTT)l . This of course just means that C2 
is at least an order of magnitude smaller than the other Ci - and therefore negligible. Speeds can be converted to 
base per seconds by dividing each Ci hy 5 = 3.4A; excitation thresholds can be converted in inverse of seconds by 
multiplying each qi by 27rc, where c is the speed of light. 
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FIG. 3: Graph of the dispersion relations IIVl.911 in the first Brillouin zone. We plot LjQ,/(27rc) (c is the speed of light) as a 
function of fcA/2. lji, W2,tJ3,tJ4 are represented respectively by the thin continuous, thick continuous, thick dashed and thin 
dashed line. Units are cmT^ in the vertical axis and radiants in the horizontal axis. 

VII. NONLINEAR DYNAMICS AND TRAVELLING WAVES 

After studying the small amplitude dynamics of our model, we should now investigate the fully nonlinear dynamics. 
We are in particular interested in soliton solutions, and on physical basis they should have - if the model has any 
relation with real DNA - a size of about twenty base pairs. This also means that such solutions vary smoothly on 
the length scale of the discrete chain, and we can pass to the continuum approximation. On the other hand, such a 
smooth variance assumption is not justified on the length scales (five base pairs) involved in the helicoidal interaction, 
and one should introduce nonlocal operators in order to take into account helicoidal interactions in the continuum 
approximation '2(l|. 

Luckily, numerical experiments show that - at least in the case of the original Yakushevich model - soliton solutions 
are very little affected by the presence or of the helicoidal terms (as could also be expected by their intrinsical smallness, 
in a context where they cannot play a qualitative role as for small amplitude dynamics) [26j| . Thus, we will from now 
on simply drop the helicoidal terms, i.e. set K^ — 0. 



A. Continuum approximation and field equations 

The continuum description of the discrete model we are considering requires to introduce fields $'°)(z, i), 0'^'^)(z, t) 
such that 



The continuum approximation we wish to consider is the one where we take 



e(a;±(5,i) 



$(x,t) ± b^^ix.t) + (l/2)52$,,(x,t) , 
Q{x,t) ± 6Q,{x,t) + (l/2)(52e,,(a;,i) . 



(VIl.l) 



(V11.2) 



Inserting (|VII.2p . and taking K^ = 0, into the Euler-Lagrange equations ljlV.2|) . we obtain a set of nonlinear coupled 
PDEs for '^'^°-\x,t) and 6^'^^(x,t), depending on the parameter 5. Coherently with (|VII.2p . we expand these equations 
to second order in 5 and drop higher order terms. The equations we obtain in this way are symmetric in the chain 
exchange. We will be able to decompose their solutions into a symmetric and an antisymmetric part under the same 
exchange. In view of the considerable complication of the system, it is convenient to deal directly with the equations 
for these symmetric and antisymmetric part and to enforce the Y contact approximation R + dh — a 

That is we will consider fields 



<J,± = $(1) ± $(2) ^ 0±^ 0(1) ±0(2) ^ 

and discuss equations which result by setting two of these to zero. In the symmetric case, i.e. for 

Q^^\x,t) = e(2)(x,i) = e(x,i) , $(i'(a;,i) = $''^(x,i) = $(a;,i) , 



(VI1.3) 



(V11.4) 
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the resulting equations are 



(VII.7) 



mr^^tt + (mr^ +TOi?r cos <i>)6ft = -2a{a - R)KpSm{^ + Q) + 

~R{2Kp{R -a) + mrQ'^t) sin $+ 

+(52 Ksr [r($:rx + Oxx) + RQxx cos $ + i?e2 sin $] ; 
(mr^ +TOi?r cos $)$4t + (/ + TO(i?2 +r2 + 2i?r cos $))eit = (VII.5) 

= -2aKp{Rsi\iQ + (a - i?) sin($ + 9)) + m$fi?r($f +2et)sin$+ 

+52 [if^r2$,, + (i^t + K,{R^ + r2))e,,+ 

+i^,i?r (($^^ + 29:,:,) cos $ - $:,($:, + 26:,) sin $)] . 

In the antisymmetric case 

<d^^\x,t) = -<d^'^\x,t) = Q{x,t) , $(i)(a;,t) = -<^^'^\x,t) = ^{x,t) , (VII.6) 

the resulting equations are 

mr^^tt + {mr^ + mRr cos $)6ft = 

= Kp{a - R) [R sin($ + 26) + {a - R) sin(2($ + 6)) - 2a sin($ + 6)) + 

+R{Kp{a - R)- mrQ^) sin $+ 

+^2^;^ - sr [r($,, + 6:,:,) + i?cos($)e,, + i?sin($)e2] ; 
{mr^ + mRr cos ^)^tt + {I + m{R'^ + r^ + 2Rr cos $))6t( = 

= Ji:p(-2ai?sine + i?2sin(2e)+ 

+ (a - i?)(-2asin($ + 6) + (a - i?) sin(2($ + 6)) + 2i?sin($ + 26))) + 

+m^trR{^t + 26t) sin $+ 

+52 [K,r2$,, + KtQxx + Ks{R^ + r^)e,,+ 

+KsrR{{^^^ + 26:,:,) cos $ - $:,($:, + 26:,) siu $)] . 

We will not write the equations in the cases of mixed symmetry, i.e. for 6'^^^(a;,i) — Q'^'^\x,t) ~ Q{x,t), <i>'^^^(x,t) = 
-«>(2)(a;,t) = ^{x,t) and for 6(i)(x,t) = -6(2)(x,t) = e{x,t) and $(i)(2;,t) = ^^^^x,t) = $(a;,t). 

B. Soliton equations 

When studying DNA models, one is specially interested in travelling wave solutions, i.e. solutions depending only 
on 2: := X — vt with fixed speed v: 

$(") {x,t)= ip^"'^ {x ~vt):^ if'^"^ [z) , 6^'^) (x, i) == 1?^") [x -vt):= d^"'^ [z) . (VII.8) 

If we insert the ansatz (jVII.Sp into the equations (jVII.Sp and ljVII.7p . we get a set of four coupled second order 
ODEs; defining 

^ := (mv2 _ KsS"^) , J := {Iv^ - KtS^) , (VII.9) 

in the completely symmetric case (jVII.5|l we obtain 

^r^if" + iir{r + R cos If) d" — 

= -2aKp{a - R) sin {ip + d) + KsS^Rr sin (^) {d'f + 

-R sm{ip) (-2i^p(a - R) + mrv'^{i)'f) ; (VII.IO) 

^r{r + R cos if) if" + [J + pi{R'^ + r'^ + 2Rr cos (p) ■&" = 

= -2aKp {Rsini} + {a - R) sin((/3 + i}j) + fiRr sm{Lp)[{ip')^ + 2ip'd'] . 
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In the completely antisymmetric case (|VIL7|I . instead, we get 

lir'^ ip" + ^r{r + R cos If) -d" = 
= -2Kp{a - R){a - Rcos-d - (a ~ R) cos{ip + -&)) sm{ip + iS) + 
-fiRr {sin ip){'d')^ ; 
fir{r + R cos >f)>f" + [J + fiiR"^ + r^ + 2Rr cos f) d" = (VII.ll) 

= -Kp [2aRsm^ - R^ sin(2i9)+ 
+ (a - R) (2asin((^ + d) - {a - R) sin{2{Lp + d)) - 2Rsm{ip + 2d))] + 
+/iri?(sinv3)[((y9')^ + V^'] • 

The previous equations appear too involved to be studied analytically at least in the general case. Numerical results 
are discussed in Sect. IIXI below. Some understanding can be gained at the analytical level by considering a particular 
case of the full equations (|VII.10p . (jVII.ll|l . when the system reduces essentially to the Y case. The next section is 
devoted to this. 

C. Boundary conditions 

We have so far just discussed the field equations l|VII.5(l and (|VIL7p and their reductions; however these PDEs make 
sense only once we specify the function space to which their solutions are required to belong. The natural physical 
condition is that of finite energy; we now briefly discuss what it means in terms of our equations and the boundary 
conditions it imposes on their solutions. 

The field equations l|VII.5p . (|VII.7|I are Euler-Lagrange equations for the Lagrangian obtained as continuum limit 
of (|III.1|I . In the present case, the finite energy condition corresponds to requiring that for large \z\ the kinetic energy 
vanishes and the configuration correspond to points of minimum for the potential energy. The condition on kinetic 
energy yields 

$t(±oo, i) = , et{±oo, t) = , (VII.12) 

where of course <I>t(±oo,i) stands for hm^^ioo ^t{z,t), and so for Qf 

As for the condition involving potentials, by the explicit expression of our potentials, see above, this means (with 
the same shorthand notation as above) 

$(±oo, t)^0 , e(±oo, t) = 2n±TT , ,^jj ^g, 

$^(±cx),i) = , e^(±oo,t) = ; 

Let us now consider the reduction to travelling waves, i.e. eqs. (|VIL10|I and (|VII.11|I . In this framework, conditions 
(|VIL12p and (jVII.lSp imply we have to require the limit behavior described by 

V3(±oo) = , ^^(ioo) == 2n±7r , .^^^ ^^. 

ip'{±oo) = , ^^'(ioo) = , 

for the functions f{£,), i^iS,)- 

We would like to stress that eqs. (|VII.10(I and HVII.11|I can also be seen as describing the motion (in the fictitious 
time ^) of point masses of coordinates {){£,), (p{£,) in an effective potential; such a motion can satisfy the boundary 
conditions ljVII.14|) only if {ip, d) = (0, 27Tk) is a point of maximum for the effective potential. This would provide the 
condition fj, < and hence a maximal speed for soliton propagation (as also happens for the standard Y model); we 
will not discuss this point here, as it is no variation with the standard Y case, and the condition /i < is satisfied 
with the values of parameters obtained and discussed in Sect. Ivl 

The solutions satisfying (|VII.14p can be classified by the winding number n :— n+ — n_. Needless to say, here we 
considered the equations describing symmetric or antisymmetric solutions, but a similar discussion also applies to the 
full equations, i.e. those in which we have not selected any symmetry of the solutions; in this case we would have 
two winding numbers, which we can associated either to i?'^\ ^^^^ or directly to their symmetric and antisymmetric 
combinations ^9^^-' ± 'd'-^K 
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VIII. COMPARISON WITH THE YAKUSHEVICH MODEL 



The standard Y model [SjI can be seen as a particular limiting case of our model. Thus, a check of the validity of 
our model - and in particular of the fact we considered here physical assumptions which correspond to those by Y 
in her geometry - can be obtained by going to a limit in which our composite Y model reduces to the standard Y 
model. The latter can be obtained as a limiting case of our composite model in two conceptually different ways. 

• A first possibility, which we call parametric, is to choose the geometrical parameters of the model so that its 
geometry actually reduces to that of the standard Y model. 

• A second possibility, which we call dynamical is to force the dynamics of our model by setting ip"' — 0, i.e. by 
freezing the non-topological angles and constraining them to be zero. 

Let us briefly discuss these in some more detail. The parametric way to recover the standard Y model from our 
model consists in setting to zero the radius of the disks modelling the bases, and at the same time pushing it on the 
disk representing the nucleoside on the DNA chain. In this way the base corresponds to a point on the circle bounding 
the disk representing the nucleoside. Note that this would cause a change in the interbase equilibrium distance, unless 
we at the same time also change the radius of the disk representing the nucleoside. 

This limiting procedure corresponds - recalling we also want to recover the Y approximation of zero interbase 
distance - to the following choice of the parameters: 



Ks = dh = r = , a = R 



(VIII. 1) 



After the base has been pushed on the disk, its mass enters to be part of the disk's mass - hence contributes to its 
moment of inertia - and we can thus just take m = Q. Similarly, as the bases have lost their identity and are enclosed 
in the disk modelling the whole nucleotide, the effective stacking interaction has to be physically identified with the 
torsional interaction of the disk now modelling the entire nucleotide. For this reason in our equations we will take 
Kg = Q and Kt -^ Ks- 

Use of equations (jVIII.l|l into the equations of motion (|IV.2|I yields 
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KssHo'tX-e.^ 
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K^ sin( 



KpR^ sin( 
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q(a) 



(VIII.2) 



The previous equations represents the equations of motions for the Y model. Some care has to be used when the 
values of the parameters given by Eq. IjVIII.ip correspond to singular points of the equations. This is for instance 
the case of the dispersion relations wi,W3 in Eqs. (jVI.9|l . which are singular for to = 0. The dispersion relations for 
the Y model can be easily found by linearizing the system (|VIII.2p . One finds two dispersion relations; one is given 
by u)2 of the composite model, see Eqs. (jVI.9p : the other is 



^ 2R — + ^-sm (y) + -psm ( — ) 



(VIII.3) 



To obtain the Y model dynamically from our model, we set <I> = into the continuum equations IjVII.Sp and ljVII.7p . 
We also enforce the Y condition R + d^ = a and work in the zero radius approximation for the disk modelling the 
base, i.e we set r — df^. In the fully symmetric case we get from Eqs. (jVII.Sp 



mQf 



-2KpSin{Q) + S^KsQ^^ 






TO e. 



-2Kpiiine 



Kt 



(r+fl)2 



Ks 



e. 



Compatibility of the previous two equations requires that 

IQu = S^KtO,, . 
In the case of travelling wave solutions Q{x, t) = z9(a; — vt) the constraint IjVIII.Sp reads 

v^ = —Kt. 

I 



(VIII.4) 



(VIII.5) 



(VIII.6) 



We take from now on the positive determination of velocity for ease of discussion. Using the Eqs. IJVII.8|) . (|VII.9p 
and ljVIII.6|l . Eqs. IJVIII.4|I yields the travelling wave equation. 



d" =-2{Kp/no) sini? 



(VIII.7) 
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where 

fio = {mKt - IKs) J . (VIIL8) 

With the usual boundary conditions 'i9'(±oo) = 0, t?(oo) = 27r, ??(— oo) = 0, Eq. IjVIII.Tp has a solution for hq < 0, 
given precisely by the (1,0) Yakushevich soliton 



^ = 4 arctan [e"'"^] , K=^2Kp\iio\- (VIIL9) 

We have thus recovered for the topological angles - imposing the vanishing of non-topological angles as an external 
constraint - the Y solitons. The condition for the existence of the soliton, fiQ < 0, implies that the physical parameters 
of our model must satisfy the condition 

^ < ^ . (VIII. 10) 

1 m 

With the parameter values given in the tables HI and IIIII we have 

mKt 



IK., 



0.3 < 1 ; (VIII.ll) 



hence (jVIII.lOp is satisfied and we are in the region of existence of the soliton. 

Let us now consider the antisymmetric equations. Using ip — Q into the system (|VII.ll|l and the compatibility 
equation ljVIII.6|l we get 

d" = -2{Kp/fio) (1 -cosi?) sin?? . (VIII.12) 

As expected this equation, with the usual boundary conditions (see above), admits a solution for /iq < 0, and the 
solution is in this case given by (0, 1) Yakushevich soliton 

•d = -2arccot[-Kz] (VIII.13) 

(with K as above). The allowed values of the physical parameters are determined by the same arguments used for the 
(1,0) soliton. 

It should be stressed that in the standard Y model the travelling waves speed is essentially a free parameter, 
provided the speed is lower than a limiting speed zZ, z3]. Here, recovering the standard Y model as a limiting case of 
the composite model produces a selection of the soliton speed given by Eq. IJVIII.6P : this makes quite sense physically, 
as it coincides with the speed of long waves determined by the dispersion relations l|VI.ll(l . 

IX. NUMERICAL ANALYSIS OF SOLITON EQUATIONS AND SOLITON SOLUTIONS 

Even after the several simplifying assumptions we made for our DNA model, the complete equations of motions 
given by Eqs. (jVII.5(l and 1)VII.7|I respectively for symmetric and antisymmetric configurations, are too complex to 
be solved analytically; the same apphes to the reduced equations HVII.lOp . (jVILlip describing soliton solutions. We 
will thus look for solutions, and in particular for the soliton solutions we are interested in, numerically. 

In order to determine the profile of the soliton solutions we will analyze the stationary case, with zero speed and 
kinetic energy, and apply the "conjugate gradients" algorithm (see e.g. |2Ml35J ) to evaluate numerically the minima 
of our Hamiltonian 

This approach also allows a direct comparison with the results obtained for the standard Yakushevich model, and 
shown in 57], where authors proceed in the same way and by means of the same numerical algorithm; this again 
with the aim, as remarked several times above, to emphasize the differences which are due purely to the different 
geometry of our "composite" model. With the same motivation, we have also checked our numerical routines by 
applying them to the standard Y model; in doing this we have also considered with some care ~ and fully confirmed 
- certain nontrivial effects mentioned in l57l. 
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A. Solitons in the standard Yakushevich model 

As mentioned above, we will first present the numerical investigation of the stationary solitons of Yakushevich 
model. Although the soliton solutions of the Yakushevich model have already been the subject of previous numerical 
investigations |57j. it is useful to repeat the analysis here in order to check our algorithm (and confirm the results 
reported by |57l|). 

Moreover, in the next section we will compare the soliton solutions of our composite model with those of the 
Yakushevich model. We will therefore need an explicit numerical results for the Yakushevich model obtained using 
our code. 

The homogeneous Yakushevich Hamiltonian for static solutions, i.e. setting the kinetic term T to zero, is 

= ^Y.n [^"V"^ + A„X^] + 9 Ln [1 - COS Anlp COS A„x] + 2^^ I]n [l " ^ COS Ipn COS Xn + COS^ X«] • 

Here / is the inertia momentum; K and g respectively the pairing and torsional coupling constants of the Yakushevich 
model; O'-^'^' are the angles describing the sugar rotation with respect to the backbone. Moreover, we write ip = 6^ , 
X — 0^ (the 6 are defined as in Eq. (jVLSp l. and An?/) := ipn+i — ipn and similarly for x- 

If we select the physical value for / (given in tableP) and factor out the K (equivalently, we measure energies in units 
of K; this takes the value K = 150KJ/mol), then the only independent parameter left in H is the coupling constant 
g. This can and will be used, as in |57|, for a parametric study of the solutions. In our numerical investigations we 
used (in order to avoid any accidental spurious effect) two independent implementations of the conjugate-gradient 
algorithms, i.e. the one developed by Numerical Recipes [33 and the one provided by the GNU [23. The results 
obtained with the two implementations turned out to be in very good agreement. 

The conjugate-gradient algorithms requires a "starting point" , i.e. an initial approximation of the minimizing 
configuration; in the case of multiple minima, the algorithm will actually determine a local minimum or the other 
depending on this initial approximation. Following [57| . we have used as starting points hyperbolic tangent profiles 

6»i^2 ^ ^1,2 ^(--^ _^ tanh(/3(2n - N))) , (IX.2) 

where {q^, q^) is the topological type of the soliton, N is the number of sites in the chain, and j3 a parameter, whose 
reasonable range is about < /3 < A^/10, used to adjust the profile. The parameter (3 is crucial to determine the 
structure of the minima (hence of the solitonic solutions) of the Yakushevich Hamiltonian. Choosing different ranges 
of variation for (3 we are probing different dynamical regions of the system, where different local minima of the 
Hamiltonian may be present. 

In the case of the "elementary" solitons - the (1,0) and the (0, 1) ones - for which only one degree of freedom 
matters, we obtain the same local minimum, up to 10~^ in the energy and 10~^ in the angles, independently of (3 
(provided of course that (3 is not too close to zero; in our case it suffices to keep /3 > 4) in agreement with the fact 
that in this case the minimum is known to be global. 

1. Quasi-degeneration of the energy minimizing configurations for higher topological numbers 

The situation appears to be different in the case of the (1, 1) soliton. Here in facts numerical investigations show 
a strong dependence on (3 of the local minimum determined by the algorithm for most of its range, in particular 
for /3 > 6, while energies vary very little - within 0.2% - about 49. 3K. This suggests we are in the presence of a 
rather complex structure of the phase space for the (1,1) limiting conditions. There still is however a small interval, 
(4 < /3 < 6), where the algorithm behaves exactly as in the (1,0) and (0,1) cases, i.e. yields almost the same 
energy minimizing configuration as f3 is varied, and allows us to find what we believe is the global minimum of the 
Hamiltonian. 

We have detected this same behavior also in solitons of higher topological types, e.g. (2, 0) and (2, 1). This suggests 
we are in the presence of a generic pattern |7lj| ; we point out that the reason why this same behavior is not shared 
by the solitons of types (1, 0) and (0, 1) is related to the fact that in these two cases (and only in them) the problem 
reduces actually to a one-dimensional dynamics, while all others cases are intrinsically two-dimensional. 

2. Numerical instability of the solitons 

A noteworthy fact pointed out in [531 is that the discrete version of the soliton solutions lose stability, namely switch 
from minima to saddle points, as g gets close to 0, a phenomenon that is not shared by its continuos counterpart. We 
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FIG. 4: Static solitons profiles for the Yakushevidi homogeneous Hamiltonian. Thicker lines correspond to <; = 150, thinner 
ones to p = 23.4; dashed ones to the angle 6^ and continuous ones to the angle 9^ . Topological numbers are as follows. Upper 
left (a): (1,0); lower left (b): (0,1); upper right (c): (1,1); lower right (d): (1,1). Picture (d) is obtained by a computation 
where 2ti has been approximated with 6.28, and shows a sensitive dependence of the solution on the boundary conditions. The 
energies Ef^^^^ of the solitons are: b"'"^ = 153.4 iC, s'^"^ = 62.93 7^, s'^^ = 62.93 7^, E^'^^^ = 59.32 /C, s'^'j'j = 24.63 A", 

23.4 

E = 24.63 K". Energies are measured in units oi K = 150KJ/mol. 
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FIG. 5: Static solitons profiles for the Yakushevich homogeneous hamiltonian expressed in the coordinates [ip, x)- Thicker lines 
correspond to g = 150, thinner ones to p = 23.4; dashed ones to the angle tj) and continuous ones to the angle x- Topological 

150 150 

numbers are: (1,0) in (a), (0,1) in (b), (1,1) in (c). The energies S^'^ ^^ of the solitons are: E^^ ^^ = 62.93 7^, S^^ ^^ = 153.4 if, 
-^a 0) = 195-3^, £(''1* = 24.64 if, b'^ * = 59.34 if, E^^H^ = 75.56 iiT. Energies are measured in units of if = 150KJ/mol. 



have looked for this effect and confirmed the findings of |57| . In our numerical computations the g values at which 
the transition takes place turned out to be different if computations are performed in the 9^'^ coordinates or in the 
(^,x) ones. Transition values corresponding to the onset of the numerical instability are given in Table Hvl 

The transition values are, in the {O^'"^) coordinates, g = 14.7 for the (1, 1) soliton and g — 7.05 for both the (1, 0) 
and (0, 1) ones; in the (?A, x) coordinates they are g — 7 for the (1, 1) soliton, g — 16.2 for the (0, 1) one and g — 14.7 
for the (1, 0) one. When the instability sets in, what we observe is that the soliton - i.e. the discrete configuration 
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(1,0) 


(0,1) 


(1,1) 


(V',x) 


7.05 
14.7 


7.05 
16.2 


14.7 
7.0 



TABLE IV: The transition values of go for instability (arising for g < go) of the (p, g) solitons. 



smoothly interpolating between the boundary values - breaks down and we have instead a configuration in which all 
angles are equal to the left boundary value for n < uq, and to the right boundary value for n > uq. 

In other words, the transition between the two boundary value does not take place over a (more and less extended) 
range of sites, but abruptly at a given site - which we denoted above as uq, but of course can be any site. This 
also shows that in this case we have a strong degeneration of the Hamiltonian also in the finite length case (for 
infinite chains, this is always the case as the Hamiltonian is invariant under translations), which will show up in a 
computational instability for the numerical energy minimization. 

In Fig. 21 we show the profiles of the Yakushevich solitons wc have obtained for g = 23.4 (this is the value 
corresponding to the coupling constants in use in our model, see Eq. (|IX.8(I below) and g ~ 150 (this corresponds to 
the coupling constant value chosen in |53). The profiles we obtain are qualitatively identical to the inhomogeneous 
ones presented in l57|. Incidentally, we noticed a peculiarly strong dependence on the initial conditions for the (1, 1) 
mode, so that those profiles change considerably depending on the approximation chosen for 27r: e.g., in Fig.^ we 
used an approximation extremely precise while in Fig.^ we used 27r — 6.28. We believe that the first one represents 
the correct solution, e.g. also because it is the only one of the two that respects the symmetry of the equations. 
We also produced profiles for the solitons of the Yakushevich Hamiltonian with respect to the angles ip, x (which 
are the coordinates we use in our Hamiltonian). The results are show in Fig. in these new coordinates no strong 
dependence on the initial conditions was spotted. 



B. Solitons in the composite model 



Let us now turn to the numerical investigation of our model. We will consider the case when the intrapair distance 
at the equilibrium is zero, i.e we will set a = r + d^ (contact approximation). Notice that we are not considering 
the zero-radius approximation for the bases, so that in general r ^ df^. The Hamiltonian of the system can be easily 
derived from the Lagrangian 1)111.1(1 and is given by 

H = TB+n + Vt + Vs + Vp + Vh + V^ . (IX.3) 



We use the shorthand notation 



A„-0 = -0„+i 



6i = 'Pn , nn 



Vri 



and similarly for the other variables; we also write 

a = R/r , (3 = R/dh ; 
with the values given in table QJ it results a = 0.92, f3 = 0.53. With these notations, we have 



+ A„x 

,2 



Tb= E„^s(An^ 

Tb = En hi^nC^ + AnV^ + A^i^^ + A„x^ + 2A„7/.A„C + 2A„xA„r/ + a2(A„V 

+2a( A„ V^ + A„x^ + A„ VA„C + A„xA„77) cos 5„ cos r?„ 

+2q(2A„VA„x + A„xA„C + A„VA„?7) sin^„ sin?7„] 
Vt = 2KtJ2n [cos A„ V cos A„x - 1] 
Vs = ^K^r^ J2n 4[1 + a^ -a^ cos(A„V) cos(A„x) - cos(A„x + A„?7) cos(A„V + A„0 

+2acos((S„C - S„??)/2) sin((A„V' - A„x)/2) sin((A„V - A„x + A„e - A„r7)/2) 

+2acos((S„^ + S„T])/2) sin((A„V + A„x)/2) sin((A„V + A„x + A„e + A„r7)/2)] 
Vp = ^Kpdl En 4[(1 + /3)^ + cos2(x„ + r]n) + 2/3 cos x™ cos^„ cos(x„ + Vn) + P^ cos^ Xn 

-2/3(1 + /3) COsV-n COSXn - 2(1 + (3) C0S(V'„ + Cn) COs(Xn + Vn)] 

Vh = Kh I]„[2 - cos(i/'„+5 - Xn) - cos(x„+5 - V'™)] 
Ku = K^^ I]„[tanh(^„ + ?7„) + tanh(^„ - ?]„)] 
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FIG. 6: Region of instability (black region) for the discrete solitons of topological type (0, 1). in the {gt,gs) plane. 
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TABLE V; Values of the typical energies characterizing the different interactions in the Hamiltonian of Ea. liX.4l 



Note that we have inserted in the Hamiltonian the confining potential Vw in order to implement dynamically 
the constraint ljIII.2(l for the non-topological angles (^^"'. Adding this term in the potential is also instrumental in 
stabilizing the numerical minimizations. 

rj — (and disregarding the helicoidal 



The Hamiltonian (jfX.4|l reduces to that of the Yakushevich model setting ^ 
term); with this we get 

Tb = Ib En i^ni^^ + ^"X") 

Vt - 2Kt J2n [cos ^n^P cos A„x - 1] 

Vs = ^Ksr^ E„ 4(1 + a)2 [1 - cos A„ V cos A„x] 

Vp = ^Kpdl E„ 4(1 + /3)2 [1 - 2 cos t/jn cos Xn + C0S2 Xn] 

K,= 0; 

Also note that in this case Vt and Vs differ just by a multiplicative function. 

The typical energies involved in the different interactions are given by the coefficients in Eq. IJIX.4|) . Et — 2Kt, Eg = 
^Kgr^, Ep — ^Kpdf^, Eh — K^, E^ = Kwi which represent, respectively the typical torsional, stacking, pairing. 



(IX.5) 



helicoidal and confining energies. Using the values of the physical parameters given in the tables HI If If I and choosing 
Ew in order to keep the confining energy at least a full order of magnitude smaller than any other one, we get for the 
typical interaction energies the values given in table IVl 

In order to work with dimensionless quantities, throughout this section we will measure energies in terms of Ep — 
{l/2)Kpd\ = 4.0 • lO^KJ/mol. Using the values of the kinematical parameters given in HTl and those of the dynamical 
parameters given by table Hill the dimensionless coupling constants turn out to be 



gt = Et/Ep = 0.65 , g, ^ E^/Ep = 7.2 , gp = 1 
gh = Eh/Ep = 0.026 , g^ = E^^/Ep = 0.001 . 



(IX.6) 



Note that Eq. IjIX.Sj) implies that, in the limit S, = f] = 0, the Yakushevich couplings {K,g) and our coupling 
constants are related by 



K = 2(l + /3)2gp, g ^ 2gt + 4(1 + a)^ 



this also yields 



_g_ ^ gt + 2(1 + a)^gs ^ gt + 7.4g, 
K (l + /3)% - 2.3gp 
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FIG. 7: Stationary solitonic solutions of our model with energy E — 80.06 Ep of topological numbers (1,0) (thick line) compared 
with the solitonic solutions of the Yakushevich model of energy E — 75.56 Ep (thin line). Upper left (a): the angle V; upper 
right (b): the angle x; lower left (c): the angle 5; lower right (d): the angle rj. The thin line segments visible in (b) show the 
small difference between the profile of the solitons of our and of the Yakushevich model. 
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FIG. 8: Stationary solitonic solutions of our model with topological numbers (0, 1) for different values of the normalized 
couplings. The angle i/) is depicted on the left whereas ^ is depicted on the right. The soliton relative to the physical coupling 
constants with energy E = 189.9i5p (thick line) is shown together with those relative to the coupling constants gt = 0, Qs ~ 46 
with energy E = 492.4_Ep (thin dashed line) and gt = 345, gs = (thin line, E = 388.5Ep) to show how the profile would 
change at the increasing of the coupling constants in the two extreme cases of negligible torsional or stacking interactions. 



Most of the statements made in the previous section for the Yakushevich Hamiltonian apply almost verbatim to 
our case. We obtain an approximate profile of a soliton, subject to the boundary conditions 1)VII.14(I . by minimizing 
numerically the Hamiltonian through the "conjugate-gradient" algorithm, in particular through its implementations 
in the GSL J23 and in the Numerical Recipes |35l |. 

To enforce a particular topological type (p, q) for the soliton under study we fix the angles at the extremes of the 
chain so that ip-oo = X-oo — and V'+oo = 27rp, x+oo — 27rg, while the non-topological angles are requested to be 
identically zero at the extremes. As "starting point" for the the algorithm (see above) we use the natural choice |^ 



V'n = 7rp(l + tanh(/3(2n - N))) , 
Xn = 7rg(l + tanh(/3(2n - N))) , 

C« = Vn = , 



(IX.9) 



where /? is a parameter used to adjust the profile of the initial configuration (the starting point) and A'' is the number 
of sites on the chain. The number N is of course much smaller than in real DNA (usually A^ = 4000 in our simulations) 
but big enough to ensure that ip and x s-re constant at the beginning and the end of the chain within the numerical 
precision of our computations. 

Like in previous case, there is a threshold for the coupling constants that must be surpassed for the solitons to be 
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stable; in Fig. |Blwe show the region of instabihty for sohtons in the {gt,gs) plane when we fix the values of the other 
coupHng constants to be gh = 0.026, gp = 1, g^ = 0.001. As above, in the case of solitons of topological type (1,0) 
and (0, 1) we always reach the same minimum - within 10~^ in the energy and 10~^ in the angle - while f3 varies 
across almost two orders of magnitude, provided that /3 > 4 to avoid falling on the step solution. 

The (1, 1) soliton also shows the very same behavior as for the Yakushevich Hamiltonian, namely a different local 
extrema for every value of /? except for a short interval 4 < /? < 5.5 and for a range of energies wider - within 2% 
- about 160K; in the latter we get the same stable behavior observed for the (1,0) and (0, 1) solitons. In this case 
again, as in the Yakushevich case, the sensitivity of the numerical solitonic solutions to the values of the parameter /? 
could be an indication of the existence of many, almost degenerate, solitonic solution for topological numbers different 
than (1,0) or (0,1). We would like to stress that that the existence of different, almost degenerate, local minima is 
a typical feature of most bio-physical systems. However, our results concerning these point have to be regarded just 
has an indication. Further investigations, in particular at the analytical level, are needed in order to draw a definite 
conclusion. 

In Fig. Owe plot the (1, 0) soliton of our model with the physical values of the normalized coupling constants given 
by Eq. (|IX.6|I and compare them with those obtained for the Yakushevich model. The profiles of the topological 
angles change very little from the corresponding profiles of the Yakushevich solitons. 

We have also investigated the deformation of soliton profiles when adjusting selected parameters of our Hamiltonian. 
First, in order to see how the shape of the soliton changes upon increasing the strength of the torsional/stacking 
interactions, in Fig.|S|we compare the profiles of the (0, 1) solitons with profiles corresponding to g/K ~ 150 (sec Eq. 
(|IX.8|I . i.e. the couphng constant used in |57|. 

As it is not completely clear how to separate the interaction strength between torsional and stacking interactions 
(for our choice of physical constants in Sect. 0) we have used about the smallest reasonable value for Kt- We present 
the profiles corresponding to the two extreme possibilities: the one in which we put all the strength in the backbone 
torsional interaction (gt = 345, gs = 0), and the one in which we put all of it in the bases stacking {gt = 0, gs = 46). 
The effect is the widening of both the soliton and the non-topological profiles by roughly a factor 4 in the first case 
and of a factor 2 in the second case. 

In Fig. 1^ we compare the profiles of the soliton (1, 1) with those obtained by using the correct distance function 
for Vp, namely by replacing gpp^ with gp{p — do/dh)'^ (where do — 2 A is the equilibrium distance between two bases 
in a pair 7,2j), and by varying the helicoidal interaction term. No relevant changes are detected in the first case: 
relative differences in energies and angles are of the order of lO^'^ in energy and 10~^ in the angles; even increasing 
the base-pairs distances by two orders of magnitude these results do not modify the situation. 

As for the helicoidal term, we get variations of the same order of magnitude as above if we simply turn it off. If we 
instead increase the coupling constant by one order of magnitude (gh = 0.26), then we get energy and angles changes 
of the order of 10~^ and by increasing it to gh = 1 we arrive to changes of the order of 10" in both the angles and the 
energy. 

Raising g^ up to g^ — 2 leads to the disappearance of the soliton; it seems reasonable to argue that this is due to 
such an interaction favoring a sharper transition between limit behaviors, so that the discreteness effect discussed in 
the previous subsection arises. 

C. Discussion 

The numerical analysis we have performed shows the existence of solitonic solutions of our composite DNA model. 
The profiles of the topological solitons - in particular, the part relating to the topological degree of freedom - of 
our model are both qualitatively and quantitatively very similar to those of the Y model. This means that the most 
relevant (for DNA transcription) and characterizing feature of the nonlinear DNA dynamics present in the Y model 
is preserved by considering geometrically more complex and hence more realistic DNA models. 

Moreover, the topological soliton profiles of our model seem to change very little when either the physical parameters 
change in a reasonable range or also the form of the potential modelling the pairing interaction is modified to a more 
realistic form. In particular, the form of the topological solitons are very little sensitive to the interchange of torsional 
and stacking coupling constant. 

This feature add other reasons why the Y model, although based on a strong simplification of the DNA geometry, 
works quite well in describing solitonic excitations. The Y model, indeed, does not distinguish between torsional and 
stacking interaction; but, as we have shown, this distinction is not relevant - at least as long as one is only interested 
in the existence and form of the soliton solutions. The "compositeness" of our model becomes relevant - and rather 
crucial - when it comes on the one hand to allowing the existence of solitons together with requiring a physically 
realistic choice of the physical parameters characterizing the DNA, and on the other hand to have also predictions 
fitting experimental observations for what concerns quantities related to small amplitude dynamics, such as transverse 
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FIG. 9; Comparison of stationary solitonic solutions of our model with those obtained using a modified pairing potential Vp. 
Upper left (a): the angle ip; upper right (b): the angle xi lower left (c): the angle (,; lower right (d): the angle r]. The thick 
line represent a soliton with E = 80.06 and topological numbers (1, 1)). The thin dashed line gives the profile for the same 
soliton with energy E = lAAlEp and with the "correct" pairing potential Vp = gp{p — do / dhf' . The latter solitonic solution has 
been derived taking do = 3.2dh, namely a order of magnitude bigger than its physical value, to enhance the profile differences 
(an almost identical profile is obtained if we suppress the helicoidal term from the Hamiltonian) . The thin continuos line 
{E — 102.9£'p) is the profile we get by increasing the helicoidal term to g^ = 1. 

phonons speed. In other words, the somewhat more detailed description of DNA dynamics provided by our model 
allows it to be effective - with the same parameters ~ across regimes, and provide meaningful quantities in both the 
linear and the fully nonlinear regime. 

The solitonic solutions of the composite model share also two other features with those of the Y model, namely the 
presence of a numerical instability and the existence of quasi-degenerate solutions for solitonic configurations with 
higher topological numbers. 

We expect that the model considered here is the simplest DNA model describing rotational degrees of freedom 
which, with physically realistic values of the coupling constants and other parameters, allows for the existence of 
topological solitons and at the same time is also compatible with observed values of bound energies and phonon 
speeds in DNA. 

X. SUMMARY AND CONCLUSIONS 

Let us, in the end, summarize our discussion and the results of our work, and state the conclusions which can be 
drawn from it. 



A. Summary and results 

Following the work by Englander et al. [l3| , different authors have considered simple models of the DNA double 
chain - focusing on rotational degrees of freedom - able to support dynamical and topological solitons |53 , supposedly 
related to the transcription bubbles present in real DNA and playing a key role in the transcription process. 

These models usually consider a single (rotational) degree of freedom per nucleotide I Sfil. albeit models with one 
rotational and one radial degree of freedom per nucleotide have also been considered 0, II, lU Ull ^M i^ 3,n extension 
of "purely radial" models "Sy, ^41], considered in the study of DNA denaturation). A simple model which has been 
studied in depth is the so called Y model 52]. This supports topological solitons (of sine-Gordon type) and provides 
correct orders of magnitude for several physically relevant quantities |26l |: on the other hand, the soliton speed remains 
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essentially a free parameter [22, 123 , and the speed of transverse phonons can be made to have a physical value only 
by assigning unphysical values to the coupling constants of the model [53 ■ 

Here we have considered an extension of the Y model, with two degrees of freedom - both rotational - per nucleotide; 
one of these is associated to rotations of the nucleoside (unit of the backbone) around the phosphodiester chain and is 
topological - i.e. can go round the 5*^ circle ~ while the other is associated to rotations of the attached nitrogen base 
around the Ci atom in the sugar ring, and due to sterical hindrances is non-topological - i.e. rotations are limited to 
a relatively small range around the equilibrium position. We denoted this as a "composite Y model" . 

Several parameters appear in the model; some of these are related to the geometry and the kinematics of the DNA 
molecule, while other are coupling constants entering in the potential used to model intramolecular interactions. We 
have assigned values to the first kind of parameters following from available direct experimental observations, and for 
the second kind of parameters we used experimental data on the ionization energies of the concerned couplings and 
the form of the potentials appearing in the model. That is, these parameters were not chosen by fitting dynamical 
predictions of the model; see sect. V for detail. We have first considered small amplitude dynamics (sect. VI); this 
yields the dispersion relations and produced some prediction on the phonon speed and the optical frequency for the 
different branches. These prediction are a first success of the present model, in that it was shown in sect. VI that 
taking the physical order of magnitude for the pairing, stacking, torsional and helicoidal interaction, one can obtain 
the order of magnitude of the experimentally observed speed for transverse phonon excitations and the frequency 
threshold for the optical branch. In particular, using a physical value for the stacking and torsional interaction energy 
we get a value for the transverse phonon speed which is about three times the "correct" one. It should be considered, 
in looking at this value, that we modelled the intrapair interaction by a very simple and non realistic potential (with 
the aim of both keeping computations simple and allowing direct comparison with the standard Y model by making 
the same simplifying assumptions as there). As for comparison with standard Y model, hence for an evaluation of 
the advantages brought by considering a more articulated geometry of the nucleotide, it should be recalled that the 
numerical computations of Yakushevich, Savin and Manevitch (57j | (which we repeated, and fully confirmed) show in 
order to obtain the experimentally observed speed for transverse phonon excitations in the framework of the standard 

Y model, one should take a coupling constant for the transverse intrapair interaction which is about 6000 times the 
"correct" one. 

We passed then to consider the fully nonlinear regime, and in particular to look for solitonic-like travelling ex- 
citations. These should have smooth variations on the space scale of nucleotides, hence we passed to a continuum 
description and field equations; by using the chain exchange symmetry, we considered fully symmetric and antisym- 
metric reductions, see (|VII.5|I and (jVII.7|l . By a travelling wave ansatz we reduced these to a system of two coupled 
second order ODEs for t9(z) and (p{z), see (jVII.10|) and ljVII.ll|) . Here -d is the topological angle, i.e. the variable 
associated to the topological field, and (p is the non topological angle, i.e. the variable associated to the non topological 
field. 

The finite energy condition ljVII.12|l . (|VII.13|I requires that the solutions to this system of ODEs satisfy certain 
limit conditions (see l|VII.14l ) . These in turn imply that solutions satisfying them can be classified according to two 
topological indices (winding numbers for the topological fields; in the symmetric or antisymmetric case, one index is 
enough to determine the other as well). 

We have also shown that the standard Y model can be obtained from the composite Y model by a limiting procedure 
(sect. VII); this also reduces the solitons of the composite model to solitons of the Y model. However, the limiting 
procedure requires that a certain condition is satisfied, see eq. ()VIII.5|I . and this in turn constraints the speed of 
solitonic excitations; see (jVIII.6|l . Thus, the requirement to obtain the standard Y solitons in a certain limit fixes the 
speed of solitons; the resulting speed is just the speed of long waves as determined by the dispersion relations. 

Finally, in sect. IX we conducted a careful numerical investigation of the simpler soliton solutions for the composite 

Y model. We preliminarily checked our numerical routines on the standard Y model and fully confirmed the results 
of Yakushevich, Savin and Manevitch |57| , also confirming certain instability phenomena they reported. We then 
considered the solitons for the composite Y model with the value of parameters descending from their physical 
meaning (i.e. with no parameter fitting), confirming their existence, properties and stability. We also showed how 
the profile of the soliton component corresponding to the topological degree of freedom is extremely similar to the 
standard Y soliton with same topological numbers. We considered next the stability of these soliton solutions upon 
varying the parameters of the model, and observed that as in the standard Y case there is a stability threshold. Thus, 
the existence and stability of soliton solutions for physical values of the parameters is a nontrivial prediction. 

B. Discussion and conclusions 

The composite Y model considered here retains all the favorable features of the standard Y model. At the same 
time, its more articulated geometry allows at the same time - and with physical values of the coupling constants 
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and other parameters entering in the model - to reproduce relevant value of physical quantities related to the linear 
regime (such as speed of transverse phonon, which was a critical test for the standard Y model) and support stable 
soliton solutions. 

Further, and at difference with the standard Y model, it provides a precise prediction for the soliton speed; this 
is quite reasonable physically, as it corresponds to the speed of long waves as obtained from the dispersion relations 
for the model. Thus, our model passed some ~ in our opinion, significant - quantitative test and provides precise 
predictions. 

It should also be stressed that we used - both to simplify the mathematics and to have a direct comparison with 
the standard Y model - a very simple form for the intrapair coupling potential (and at some stage also resorted 
to the "contact approximation" to get simpler formulas, again as in the standard Y model treatment). It is quite 
conceivable that adopting a more realistic potential will provide better estimates of relevant physical quantities, in 
particular for quantities related to the linear regime. However, experience recently gained with the standard Y model 
|24L 1251 suggests that the predictions related to the fully nonlinear regime are rather little sensitive to the detailed 
form of the potential and to adopting or otherwise the contact approximation; we are thus rather confident that future 
work with more realistic potentials will confirm the results obtained in the simple setting considered here. 

Finally, we would like to remark a very relevant feature of our model. All the DNA models amenable to analytic 
treatment look at homogeneous DNA, albeit the genetic information lies precisely in the non homogeneous part of 
the DNA (i.e. the base sequence; bases have rather different physical and geometrical characteristics). Our discussion 
was no exception, and we considered identical bases with "average" geometrical and physical characteristics. But, the 
degrees of freedom we considered for each nucleotide are one concerned with the uniform part of the DNA molecule 
(the nucleosides), the other with the non homogeneous part (the base sequence). Moreover, it turned out that - for 
what concerns soliton excitations - the most relevant role is played by the (topological) variables associated to the 
uniform part, which are directly at play in the topological solitons, while the (non topological) variables associated 
to the non uniform part are in a way just accompanying the soliton. 

This suggests that, within the framework of composite models, the non homogeneous case can be studied as a (non- 
singular) perturbation of the homogeneous case; needless to say, by this we mean an analytical ~ albeit approximated 
- study, and not just a numerical one. This represents a significant advance with respect to what is possible with 
simple models considered so far. 
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broken and then built again thanks to quantum fluctuations. 
[71] There are indeed qualitative arguments suggesting a degeneration of minimizing conflguration whenever the topological 

numbers are not (1,0) or (0,1); we will not discuss these arguments nor the matter here. 
[72] In order to avoid any confusion, we stress that here the "distance" between bases refers e.g. to the N-H distance in a 

N-H-0 hydrogen bond; the total length of the bond is about 3A, and often one refers to this as the interbase distance. 

Here instead we consider the H atom - which lies at about lA from the nearer atom in the H bond - as part of one of the 

bases and hence consider the distance of it from the other atom as the interbase distance. 



